Overview

Dataset statistics

Number of variables: 23
Number of observations: 5425
Missing cells: 11971
Missing cells (%): 9.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 974.9 KiB
Average record size in memory: 184.0 B
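
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming (the report does not state this) that the missing-cell share is missing cells divided by rows times columns, and that the average record size is total memory divided by the number of observations. The function names are illustrative only.

```python
def missing_cell_share(n_missing: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells among all cells (rows x columns)."""
    return n_missing / (n_rows * n_cols)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, given the total size in KiB."""
    return total_kib * 1024 / n_rows


# Figures taken from the dataset statistics above.
share = missing_cell_share(n_missing=11971, n_rows=5425, n_cols=23)
record = average_record_size(total_kib=974.9, n_rows=5425)
print(f"{share:.1%}")     # 9.6%
print(f"{record:.1f} B")  # 184.0 B
```

Both results agree with the reported 9.6% and 184.0 B, which supports the assumed definitions.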

Variable types

CAT: 15
NUM: 8

Warnings

centro_escolar_acceso has a high cardinality: 137 distinct values
fecha_nacimiento has a high cardinality: 3690 distinct values
municipio has a high cardinality: 435 distinct values
tipo_traslado has 4525 (83.4%) missing values
nota_acceso has 1578 (29.1%) missing values
nota_admision_def has 2725 (50.2%) missing values
centro_escolar_acceso has 2461 (45.4%) missing values
cod_provincia has 162 (3.0%) missing values
provincia has 162 (3.0%) missing values
cod_municipio has 179 (3.3%) missing values
municipio has 179 (3.3%) missing values
fecha_nacimiento is uniformly distributed
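
Warnings of the first two kinds can be approximated with simple counts. The sketch below is an illustration, not the tool's implementation: the cut-off of 50 distinct values is an assumption (it is consistent with the list above, where provincia with 50 distinct values is not flagged and centro_escolar_acceso with 137 is), the function name is hypothetical, and it is meant for categorical columns only. The uniformity warning is not covered.

```python
from typing import Dict, List, Optional, Sequence


def column_warnings(
    columns: Dict[str, Sequence[Optional[str]]],
    cardinality_threshold: int = 50,  # assumed cut-off, not stated in the report
) -> List[str]:
    """Return warning strings for high-cardinality and missing columns."""
    warnings = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        n_distinct = len(set(present))
        n_missing = len(values) - len(present)
        if n_distinct > cardinality_threshold:
            warnings.append(
                f"{name} has a high cardinality: {n_distinct} distinct values"
            )
        if n_missing > 0:
            share = n_missing / len(values)
            warnings.append(
                f"{name} has {n_missing} ({share:.1%}) missing values"
            )
    return warnings


# Toy data (hypothetical, not taken from the dataset).
toy = {
    "municipio": [f"town_{i}" for i in range(60)] + [None] * 6,
    "sexo": ["H", "D"] * 33,
}
for line in column_warnings(toy):
    print(line)
```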

Reproduction

Analysis started: 2021-05-14 10:00:05.285061
Analysis finished: 2021-05-14 10:00:13.656632
Duration: 8.37 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

expediente
Real number (ℝ≥0)

Distinct: 1718
Distinct (%): 31.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 597.3996313
Minimum: 0
Maximum: 1851
Zeros: 3
Zeros (%): 0.1%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 24
Q1: 148
Median: 458
Q3: 964
95-th percentile: 1596
Maximum: 1851
Range: 1851
Interquartile range (IQR): 816

Descriptive statistics

Standard deviation: 508.7121391
Coefficient of variation (CV): 0.8515441129
Kurtosis: -0.6828524355
Mean: 597.3996313
Median Absolute Deviation (MAD): 357
Skewness: 0.6934634878
Sum: 3240893
Variance: 258788.0405
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
9 | 14 | 0.3%
10 | 13 | 0.2%
28 | 13 | 0.2%
18 | 13 | 0.2%
3 | 13 | 0.2%
11 | 13 | 0.2%
6 | 13 | 0.2%
2 | 13 | 0.2%
20 | 13 | 0.2%
16 | 13 | 0.2%
Other values (1708) | 5294 | 97.6%

Minimum 5 values
Value | Count | Frequency (%)
0 | 3 | 0.1%
1 | 10 | 0.2%
2 | 13 | 0.2%
3 | 13 | 0.2%
4 | 13 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
1851 | 1 | < 0.1%
1847 | 1 | < 0.1%
1846 | 1 | < 0.1%
1844 | 1 | < 0.1%
1840 | 1 | < 0.1%
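
The derived statistics in each numeric block are simple functions of the data. A minimal sketch using only the standard library, assuming a sample standard deviation and quartiles with linear interpolation (neither is stated in the report); `describe` is a hypothetical helper name.

```python
import statistics
from typing import Dict, Sequence


def describe(values: Sequence[float]) -> Dict[str, float]:
    """Compute a few of the descriptive statistics shown in the report."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (assumed)
    median = statistics.median(values)
    # Quartiles with linear interpolation between closest ranks (assumed).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        "mad": statistics.median(abs(v - median) for v in values),
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }


stats = describe([1, 2, 3, 4, 100])  # toy data, not from the dataset
print(stats["mad"], stats["iqr"], stats["range"])

# Consistency check against the figures reported for `expediente`.
print(round(508.7121391 / 597.3996313, 4))  # 0.8515, the reported CV
print(964 - 148)                            # 816, the reported IQR
```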
 

cod_plan
Real number (ℝ≥0)

Distinct: 18
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1627.792074
Minimum: 1621
Maximum: 1639
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1621
5-th percentile: 1623
Q1: 1626
Median: 1627
Q3: 1632
95-th percentile: 1635
Maximum: 1639
Range: 18
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.639889297
Coefficient of variation (CV): 0.002236089827
Kurtosis: 0.04872817441
Mean: 1627.792074
Median Absolute Deviation (MAD): 1
Skewness: 0.8025982261
Sum: 8830772
Variance: 13.24879409
Monotonicity: Increasing

Histogram with fixed size bins (bins=18)

Common values
Value | Count | Frequency (%)
1626 | 1412 | 26.0%
1632 | 935 | 17.2%
1627 | 740 | 13.6%
1623 | 617 | 11.4%
1628 | 568 | 10.5%
1625 | 272 | 5.0%
1624 | 247 | 4.6%
1635 | 156 | 2.9%
1631 | 132 | 2.4%
1636 | 102 | 1.9%
Other values (8) | 244 | 4.5%

Minimum 5 values
Value | Count | Frequency (%)
1621 | 17 | 0.3%
1622 | 10 | 0.2%
1623 | 617 | 11.4%
1624 | 247 | 4.6%
1625 | 272 | 5.0%

Maximum 5 values
Value | Count | Frequency (%)
1639 | 52 | 1.0%
1638 | 29 | 0.5%
1636 | 102 | 1.9%
1635 | 156 | 2.9%
1634 | 66 | 1.2%
 

des_plan
Categorical

Distinct: 16
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
GRADO EN EDIFICACIÓN | 1412 | 26.0%
GRADO EN INGENIERÍA INFORMÁTICA EN INGENIERÍA DEL SOFTWARE | 935 | 17.2%
GRADO EN INGENIERÍA INFORMÁTICA EN INGENIERÍA DE COMPUTADORES | 740 | 13.6%
GRADO EN INGENIERÍA CIVIL - CONSTRUCCIONES CIVILES | 617 | 11.4%
GRADO EN INGENIERÍA DE SONIDO E IMAGEN EN TELECOMUNICACIÓN | 568 | 10.5%
GRADO EN INGENIERÍA CIVIL - TRANSPORTES Y SERVICIOS URBANOS | 272 | 5.0%
GRADO EN INGENIERÍA CIVIL - HIDROLOGÍA | 247 | 4.6%
MÁSTER UNIVERSITARIO EN INVESTIGACIÓN EN INGENIERIA Y ARQUITECTURA | 161 | 3.0%
MÁSTER UNIVERSITARIO EN INGENIERÍA DE TELECOMUNICACIÓN | 156 | 2.9%
MÁSTER UNIVERSITARIO EN INGENIERÍA INFORMÁTICA | 102 | 1.9%
Other values (6) | 215 | 4.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 74
Median length: 58
Mean length: 46.58470046
Min length: 20
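
The frequency tables in the categorical blocks list the most common values and fold the remainder into an "Other values (k)" row. A minimal sketch of that layout, assuming that "Unique" counts the values occurring exactly once (the report does not define it); both function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def frequency_table(values: Iterable[str], top: int = 10) -> List[Tuple[str, int, float]]:
    """Top `top` values with count and share; the rest collapsed into one row."""
    counts = Counter(values)
    total = sum(counts.values())
    ranked = counts.most_common()
    rows = [(value, n, n / total) for value, n in ranked[:top]]
    rest = ranked[top:]
    if rest:
        other = sum(n for _, n in rest)
        rows.append((f"Other values ({len(rest)})", other, other / total))
    return rows


def unique_count(values: Iterable[str]) -> int:
    """Number of distinct values that occur exactly once (assumed definition)."""
    return sum(1 for n in Counter(values).values() if n == 1)


toy = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]  # hypothetical data
for value, n, share in frequency_table(toy, top=2):
    print(f"{value}: {n} ({share:.1%})")
print(unique_count(toy))
```

With `top=2` the toy data yields two named rows plus one "Other values (2)" row, mirroring how 16 distinct plans appear as ten rows plus "Other values (6)".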

anio_apertura_expediente
Categorical

Distinct: 14
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
2010-11 | 1101 | 20.3%
2011-12 | 652 | 12.0%
2013-14 | 539 | 9.9%
2009-10 | 513 | 9.5%
2012-13 | 495 | 9.1%
2014-15 | 444 | 8.2%
2015-16 | 311 | 5.7%
2016-17 | 277 | 5.1%
2017-18 | 276 | 5.1%
2018-19 | 268 | 4.9%
Other values (4) | 549 | 10.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

exp_cerrado
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
N | 3192 | 58.8%
S | 2233 | 41.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

exp_trasladado
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
N | 4525 | 83.4%
S | 900 | 16.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

tipo_traslado
Categorical

MISSING

Distinct: 3
Distinct (%): 0.3%
Missing: 4525
Missing (%): 83.4%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
I | 488 | 9.0%
S | 219 | 4.0%
E | 193 | 3.6%
(Missing) | 4525 | 83.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.668202765
Min length: 1

exp_bloqueado
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
N | 5334 | 98.3%
S | 91 | 1.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

anio_convocatoria_acceso
Categorical

Distinct: 49
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
2009-10 | 620 | 11.4%
2008-09 | 539 | 9.9%
2010-11 | 531 | 9.8%
2011-12 | 379 | 7.0%
2012-13 | 307 | 5.7%
2017-18 | 276 | 5.1%
2007-08 | 276 | 5.1%
2013-14 | 272 | 5.0%
2014-15 | 257 | 4.7%
2015-16 | 250 | 4.6%
Other values (39) | 1718 | 31.7%

Frequencies of value counts

Unique

Unique: 3
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

convocatoria_acceso
Categorical

Distinct: 17
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
JUN | 3530 | 65.1%
SEP | 1043 | 19.2%
EXT | 288 | 5.3%
FEB | 253 | 4.7%
OCT | 90 | 1.7%
JUL | 83 | 1.5%
FEX | 43 | 0.8%
DIC | 39 | 0.7%
ENE | 26 | 0.5%
NOV | 10 | 0.2%
Other values (7) | 20 | 0.4%

Frequencies of value counts

Unique

Unique: 4
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

acceso
Real number (ℝ≥0)

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.367926267
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 5
95-th percentile: 5
Maximum: 20
Range: 19
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.051944324
Coefficient of variation (CV): 0.8665575245
Kurtosis: 10.12127567
Mean: 2.367926267
Median Absolute Deviation (MAD): 0
Skewness: 2.117658595
Sum: 12846
Variance: 4.210475511
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=11)

Common values
Value | Count | Frequency (%)
1 | 3408 | 62.8%
5 | 1389 | 25.6%
3 | 527 | 9.7%
10 | 38 | 0.7%
6 | 27 | 0.5%
4 | 14 | 0.3%
17 | 8 | 0.1%
20 | 7 | 0.1%
2 | 3 | 0.1%
7 | 2 | < 0.1%

Minimum 5 values
Value | Count | Frequency (%)
1 | 3408 | 62.8%
2 | 3 | 0.1%
3 | 527 | 9.7%
4 | 14 | 0.3%
5 | 1389 | 25.6%

Maximum 5 values
Value | Count | Frequency (%)
20 | 7 | 0.1%
17 | 8 | 0.1%
10 | 38 | 0.7%
9 | 2 | < 0.1%
7 | 2 | < 0.1%
 

des_acceso
Categorical

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
Selectividad | 3408 | 62.8%
Título Universitario | 1389 | 25.6%
Formación Profesional | 527 | 9.7%
Traslado de Expediente (Estudios Españoles) | 38 | 0.7%
Acceso a Segundo Ciclo | 27 | 0.5%
Mayores de 25/40/45 años | 14 | 0.3%
Bachillerato Sin Prueba de Acceso | 8 | 0.1%
Título de Bachiller Homologado (Extranjeros) | 7 | 0.1%
COU sin selectividad | 3 | 0.1%
Estudios universitarios extranjeros parcialmente convalidados | 2 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 61
Median length: 12
Mean length: 15.32921659
Min length: 12

sub_acceso
Real number (ℝ≥0)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.375115207
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 5
95-th percentile: 6
Maximum: 99
Range: 98
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 11.0081061
Coefficient of variation (CV): 2.51607228
Kurtosis: 67.54114125
Mean: 4.375115207
Median Absolute Deviation (MAD): 1
Skewness: 8.188160732
Sum: 23735
Variance: 121.1783998
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common values
Value | Count | Frequency (%)
1 | 2032 | 37.5%
5 | 1760 | 32.4%
2 | 849 | 15.6%
6 | 710 | 13.1%
99 | 70 | 1.3%
4 | 3 | 0.1%
3 | 1 | < 0.1%

Minimum 5 values
Value | Count | Frequency (%)
1 | 2032 | 37.5%
2 | 849 | 15.6%
3 | 1 | < 0.1%
4 | 3 | 0.1%
5 | 1760 | 32.4%

Maximum 5 values
Value | Count | Frequency (%)
99 | 70 | 1.3%
6 | 710 | 13.1%
5 | 1760 | 32.4%
4 | 3 | 0.1%
3 | 1 | < 0.1%
 

des_subacesso
Categorical

Distinct: 19
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
LOE (Grados) | 1760 | 32.4%
LOGSE | 841 | 15.5%
Curso de Adaptación | 733 | 13.5%
Bachillerato LOMCE | 710 | 13.1%
Titulado universitario | 656 | 12.1%
Ciclos formativos | 445 | 8.2%
. | 102 | 1.9%
Formación Profesional II | 75 | 1.4%
Prueba de Acceso a la Universidad | 35 | 0.6%
COU | 31 | 0.6%
Other values (9) | 37 | 0.7%

Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 45
Median length: 12
Mean length: 14.45953917
Min length: 1

nota_acceso
Real number (ℝ≥0)

MISSING

Distinct: 1573
Distinct (%): 40.9%
Missing: 1578
Missing (%): 29.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.615799324
Minimum: 0
Maximum: 12.272
Zeros: 2
Zeros (%): < 0.1%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5.29
Q1: 5.8255
Median: 6.44
Q3: 7.249
95-th percentile: 8.66
Maximum: 12.272
Range: 12.272
Interquartile range (IQR): 1.4235

Descriptive statistics

Standard deviation: 1.056105224
Coefficient of variation (CV): 0.1596338057
Kurtosis: 1.255282834
Mean: 6.615799324
Median Absolute Deviation (MAD): 0.678
Skewness: 0.5318499533
Sum: 25450.98
Variance: 1.115358244
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
7 | 28 | 0.5%
6.5 | 23 | 0.4%
5.8 | 22 | 0.4%
5.75 | 21 | 0.4%
5.9 | 20 | 0.4%
6 | 18 | 0.3%
6.25 | 16 | 0.3%
5.6 | 15 | 0.3%
6.69 | 14 | 0.3%
5.65 | 14 | 0.3%
Other values (1563) | 3656 | 67.4%
(Missing) | 1578 | 29.1%

Minimum 5 values
Value | Count | Frequency (%)
0 | 2 | < 0.1%
1.369 | 1 | < 0.1%
1.559 | 1 | < 0.1%
1.719 | 1 | < 0.1%
2.017 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
12.272 | 1 | < 0.1%
10 | 2 | < 0.1%
9.9 | 1 | < 0.1%
9.859 | 1 | < 0.1%
9.8 | 1 | < 0.1%
 

nota_admision_def
Real number (ℝ≥0)

MISSING

Distinct: 1615
Distinct (%): 59.8%
Missing: 2725
Missing (%): 50.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.702148889
Minimum: 5
Maximum: 13.859
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 5
5-th percentile: 5.3419
Q1: 6.1285
Median: 7.198
Q3: 8.91125
95-th percentile: 11.75615
Maximum: 13.859
Range: 8.859
Interquartile range (IQR): 2.78275

Descriptive statistics

Standard deviation: 1.987739052
Coefficient of variation (CV): 0.258075906
Kurtosis: -0.02239247871
Mean: 7.702148889
Median Absolute Deviation (MAD): 1.268
Skewness: 0.8771500269
Sum: 20795.802
Variance: 3.95110654
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
7 | 11 | 0.2%
6.54 | 10 | 0.2%
5.9 | 9 | 0.2%
6.45 | 8 | 0.1%
5.6 | 8 | 0.1%
5.85 | 8 | 0.1%
7.15 | 8 | 0.1%
6.23 | 8 | 0.1%
5.92 | 8 | 0.1%
5.8 | 8 | 0.1%
Other values (1605) | 2614 | 48.2%
(Missing) | 2725 | 50.2%

Minimum 5 values
Value | Count | Frequency (%)
5 | 4 | 0.1%
5.004 | 1 | < 0.1%
5.018 | 2 | < 0.1%
5.02 | 3 | 0.1%
5.024 | 3 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
13.859 | 1 | < 0.1%
13.8 | 1 | < 0.1%
13.673 | 1 | < 0.1%
13.614 | 1 | < 0.1%
13.593 | 1 | < 0.1%
 

centro_escolar_acceso
Categorical

HIGH CARDINALITY
MISSING

Distinct: 137
Distinct (%): 4.6%
Missing: 2461
Missing (%): 45.4%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
231-I.E.S. NORBA CAESARINA | 185 | 3.4%
230-I.E.S. EL BROCENSE | 150 | 2.8%
220-I.E.S. UNIVERSIDAD LABORAL | 103 | 1.9%
232-I.E.S. PROFESOR HERNÁNDEZ PACHECO | 79 | 1.5%
170-I.E.S. SANTA EULALIA | 69 | 1.3%
203-I.E.S. ALAGÓN | 63 | 1.2%
240-I.E.S. ÁGORA | 56 | 1.0%
210-I.E.S. LUIS DE MORALES | 52 | 1.0%
238-COLEGIO LICENCIADOS REUNIDOS | 50 | 0.9%
234-COLEGIO SAN ANTONIO DE PADUA | 47 | 0.9%
Other values (127) | 2110 | 38.9%
(Missing) | 2461 | 45.4%

Frequencies of value counts

Unique

Unique: 6
Unique (%): 0.2%

Histogram of lengths of the category

Length

Max length: 40
Median length: 19
Mean length: 16.0764977
Min length: 3

sexo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
H | 4311 | 79.5%
D | 1114 | 20.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

fecha_nacimiento
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 3690
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
1992-06-13 | 9 | 0.2%
1992-02-04 | 8 | 0.1%
1990-10-15 | 8 | 0.1%
1991-07-03 | 7 | 0.1%
1992-03-30 | 7 | 0.1%
1992-04-22 | 7 | 0.1%
1993-10-24 | 7 | 0.1%
1992-12-18 | 7 | 0.1%
1994-08-11 | 6 | 0.1%
1995-06-17 | 6 | 0.1%
Other values (3680) | 5353 | 98.7%

Frequencies of value counts

Unique

Unique: 2498
Unique (%): 46.0%

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

cod_provincia
Real number (ℝ≥0)

MISSING

Distinct: 50
Distinct (%): 1.0%
Missing: 162
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.67832035
Minimum: 0
Maximum: 60
Zeros: 8
Zeros (%): 0.1%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 6
Median: 10
Q3: 10
95-th percentile: 31.9
Maximum: 60
Range: 60
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.64053009
Coefficient of variation (CV): 0.8091656559
Kurtosis: 9.667099711
Mean: 10.67832035
Median Absolute Deviation (MAD): 4
Skewness: 3.098695934
Sum: 56200
Variance: 74.65876023
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
10 | 2401 | 44.3%
6 | 2172 | 40.0%
28 | 138 | 2.5%
45 | 54 | 1.0%
37 | 51 | 0.9%
8 | 48 | 0.9%
11 | 44 | 0.8%
41 | 41 | 0.8%
20 | 25 | 0.5%
21 | 24 | 0.4%
Other values (40) | 265 | 4.9%
(Missing) | 162 | 3.0%

Minimum 5 values
Value | Count | Frequency (%)
0 | 8 | 0.1%
1 | 3 | 0.1%
2 | 1 | < 0.1%
3 | 9 | 0.2%
4 | 5 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
60 | 14 | 0.3%
52 | 1 | < 0.1%
51 | 6 | 0.1%
50 | 5 | 0.1%
49 | 6 | 0.1%
 

provincia
Categorical

MISSING

Distinct: 50
Distinct (%): 1.0%
Missing: 162
Missing (%): 3.0%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
CÁCERES | 2401 | 44.3%
BADAJOZ | 2172 | 40.0%
MADRID | 138 | 2.5%
TOLEDO | 54 | 1.0%
SALAMANCA | 51 | 0.9%
BARCELONA | 48 | 0.9%
CÁDIZ | 44 | 0.8%
SEVILLA | 41 | 0.8%
GIPUZKOA | 25 | 0.5%
HUELVA | 24 | 0.4%
Other values (40) | 265 | 4.9%
(Missing) | 162 | 3.0%

Frequencies of value counts

Unique

Unique: 4
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 22
Median length: 7
Mean length: 6.900276498
Min length: 3

cod_municipio
Real number (ℝ≥0)

MISSING

Distinct: 278
Distinct (%): 5.3%
Missing: 179
Missing (%): 3.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 233.7384674
Minimum: 1
Maximum: 912
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 90
Q3: 450
95-th percentile: 760
Maximum: 912
Range: 911
Interquartile range (IQR): 449

Descriptive statistics

Standard deviation: 265.2541775
Coefficient of variation (CV): 1.13483322
Kurtosis: -0.9772410625
Mean: 233.7384674
Median Absolute Deviation (MAD): 89
Skewness: 0.6763307704
Sum: 1226192
Variance: 70359.77868
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 2361 | 43.5%
584 | 308 | 5.7%
410 | 306 | 5.6%
215 | 170 | 3.1%
516 | 125 | 2.3%
264 | 104 | 1.9%
760 | 102 | 1.9%
55 | 62 | 1.1%
435 | 53 | 1.0%
785 | 44 | 0.8%
Other values (268) | 1611 | 29.7%
(Missing) | 179 | 3.3%

Minimum 5 values
Value | Count | Frequency (%)
1 | 2361 | 43.5%
5 | 1 | < 0.1%
8 | 2 | < 0.1%
10 | 15 | 0.3%
12 | 3 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
912 | 3 | 0.1%
888 | 1 | < 0.1%
884 | 1 | < 0.1%
875 | 2 | < 0.1%
868 | 3 | 0.1%
 

municipio
Categorical

HIGH CARDINALITY
MISSING

Distinct: 435
Distinct (%): 8.3%
Missing: 179
Missing (%): 3.3%
Memory size: 42.4 KiB

Value | Count | Frequency (%)
CÁCERES | 1365 | 25.2%
BADAJOZ | 606 | 11.2%
PLASENCIA | 308 | 5.7%
MÉRIDA | 306 | 5.6%
DON BENITO | 170 | 3.1%
NAVALMORAL DE LA MATA | 125 | 2.3%
MADRID | 108 | 2.0%
CORIA | 104 | 1.9%
VILLANUEVA DE LA SERENA | 101 | 1.9%
ALMENDRALEJO | 62 | 1.1%
Other values (425) | 1991 | 36.7%
(Missing) | 179 | 3.3%

Frequencies of value counts

Unique

Unique: 170
Unique (%): 3.2%

Histogram of lengths of the category

Length

Max length: 26
Median length: 7
Mean length: 9.693271889
Min length: 3

Interactions

[Figures omitted: 64 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
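
The calculation described above can be sketched in a few lines of standard-library Python. This is an illustration of the formula, not the code the report tool runs; `pearson_r` is a hypothetical name.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n-1) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(x, [2 * v + 5 for v in x]))   # close to 1: increasing linear relation
print(pearson_r(x, [-3 * v + 1 for v in x]))  # close to -1: decreasing linear relation
```

The two calls use different slopes and offsets, which illustrates the invariance under changes in location and scale: only the sign of the slope matters.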

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
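
A minimal sketch of this rank-based calculation, assuming tied values share the average of their rank positions (the text does not say how ties are handled). The function names are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]  # monotonic but not linear
print(spearman_rho(x, y))
```

Because only the ordering of the values enters the calculation, the cubic relation in the example is treated the same as a linear one.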

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
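
The pair-counting definition can be written out directly. The sketch below follows the formula exactly as worded here, with tied pairs counted as neither concordant nor discordant; statistical libraries often apply a tie correction instead, so results may differ when ties are present. `kendall_tau` is a hypothetical name.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Five concordant pairs and one discordant pair out of six.
print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```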

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
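
The report does not spell out the corrected estimator. The sketch below uses the commonly cited form of Bergsma's correction, which is an assumption about what the tool computes: the chi-squared statistic is divided by n, a bias term is subtracted, and the numbers of rows and columns are shrunk slightly. Returning 0.0 for degenerate tables is a convention chosen here, and the function name is hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def cramers_v_corrected(x: Sequence[str], y: Sequence[str]) -> float:
    """Bias-corrected Cramér's V for two nominal variables of equal length."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals, col_totals = Counter(x), Counter(y)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # convention for degenerate tables (assumption)
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Toy data (hypothetical): y is fully determined by x, then independent of x.
perfect = cramers_v_corrected(["N"] * 50 + ["S"] * 50, ["a"] * 50 + ["b"] * 50)
independent = cramers_v_corrected(["N", "N", "S", "S"] * 25, ["a", "b", "a", "b"] * 25)
print(perfect, independent)
```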

Missing values

[Figures omitted: 4 missing-value plots]

Sample

First rows

(index) | expediente | cod_plan | des_plan | anio_apertura_expediente | exp_cerrado | exp_trasladado | tipo_traslado | exp_bloqueado | anio_convocatoria_acceso | convocatoria_acceso | acceso | des_acceso | sub_acceso | des_subacesso | nota_acceso | nota_admision_def | centro_escolar_acceso | sexo | fecha_nacimiento | cod_provincia | provincia | cod_municipio | municipio
0 | 2 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2004-05 | SEP | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1981-10-23 | 10.0 | CÁCERES | 1.0 | CÁCERES
1 | 3 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2000-01 | DIC | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1977-07-09 | 10.0 | CÁCERES | 1.0 | CÁCERES
2 | 5 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1994-95 | FEB | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | D | 1970-08-26 | 6.0 | BADAJOZ | 345.0 | JEREZ DE LOS CABALLEROS
3 | 6 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | N | N | NaN | N | 2003-04 | JUN | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1978-11-16 | 6.0 | BADAJOZ | 740.0 | VILLAFRANCA DE LOS BARROS
4 | 7 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1996-97 | OCT | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1971-11-01 | 6.0 | BADAJOZ | 1.0 | BADAJOZ
5 | 8 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2005-06 | SEP | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1984-04-26 | 10.0 | CÁCERES | 1.0 | CÁCERES
6 | 9 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2006-07 | DIC | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1984-05-30 | 6.0 | BADAJOZ | 760.0 | VILLANUEVA DE LA SERENA
7 | 11 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | N | N | NaN | N | 2006-07 | DIC | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1979-06-21 | 17.0 | GIRONA | 252.0 | FIGUERES
8 | 12 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1989-90 | JUN | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1965-12-05 | 6.0 | BADAJOZ | 410.0 | MÉRIDA
9 | 13 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1987-88 | FEB | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1963-12-14 | 60.0 | EXTRANJEROS | NaN | NaN

Last rows

(index) | expediente | cod_plan | des_plan | anio_apertura_expediente | exp_cerrado | exp_trasladado | tipo_traslado | exp_bloqueado | anio_convocatoria_acceso | convocatoria_acceso | acceso | des_acceso | sub_acceso | des_subacesso | nota_acceso | nota_admision_def | centro_escolar_acceso | sexo | fecha_nacimiento | cod_provincia | provincia | cod_municipio | municipio
5415 | 118 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 1992-93 | JUN | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1970-02-18 | 10.0 | CÁCERES | 256.0 | COLLADO DE LA VERA
5416 | 120 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2016-17 | JUL | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1990-09-03 | 3.0 | ALICANTE | 310.0 | DÉNIA
5417 | 121 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2018-19 | SEP | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1991-08-13 | 6.0 | BADAJOZ | 450.0 | NAVALVILLAR DE PELA
5418 | 124 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2019-20 | JUN | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | D | 1994-10-26 | 10.0 | CÁCERES | 444.0 | MADROÑERA
5419 | 125 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2018-19 | JUL | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1993-03-23 | 10.0 | CÁCERES | 772.0 | TRUJILLO
5420 | 127 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2018-19 | SEP | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | D | 1996-11-11 | 6.0 | BADAJOZ | 740.0 | VILLAFRANCA DE LOS BARROS
5421 | 128 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2019-20 | JUL | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1996-02-13 | 10.0 | CÁCERES | 1.0 | CÁCERES
5422 | 129 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2019-20 | SEP | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1992-05-10 | 6.0 | BADAJOZ | 135.0 | CAMPANARIO
5423 | 130 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2020-21 | NOV | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | D | 1997-04-11 | 10.0 | CÁCERES | 700.0 | SIERRA DE FUENTES
5424 | 131 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2020-21 | NOV | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1992-06-13 | 6.0 | BADAJOZ | 265.0 | FUENTE DEL MAESTRE